Revised PLWAP Tree with Non-frequent Items for Mining Sequential Pattern

Authors

  • R. Vishnu Priya
  • A. Vadivel
Abstract

Sequential pattern mining is a challenging data mining task with a wide range of applications, one of which is mining patterns from weblogs. Weblogs today are highly dynamic, and parts of them may become obsolete over time. In addition, users may change the threshold value frequently during mining until they obtain the required output or find interesting rules. Some recently proposed algorithms for mining weblogs build the tree with two scans and consume considerable time and space. In this paper, we build the Revised PLWAP with Non-frequent Items tree (RePLNI-tree) with a single scan over all items. While mining sequential patterns, the links related to non-frequent items are not considered. Hence, it is not necessary to delete nodes or maintain their information when revising the tree to mine updated transactions. The algorithm supports both incremental and interactive mining: the patterns need not be recomputed each time the weblog is updated or the minimum support changes. The proposed tree performs better even when the incremental database is more than 50% of the size of the existing one. For evaluation, we used a benchmark weblog dataset and found that the performance of the proposed tree is encouraging compared with some recently proposed approaches.

Keywords: Sequential pattern mining; Weblog; Frequent and Non-frequent items; Incremental and Interactive mining
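The idea of keeping every item in the tree and applying the threshold only at mining time can be illustrated with a small sketch. This is not the RePLNI-tree itself: it is a plain prefix tree without PLWAP position codes or header linkages, and the names `Node`, `AllItemsTree`, `insert`, and `mine` are hypothetical, chosen only for illustration. It assumes each weblog session is an ordered sequence of items and that support is an absolute count of sessions.

```python
class Node:
    """One tree node: an item label, a pass-through count, and child links."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class AllItemsTree:
    """Hypothetical simplification: a prefix tree that stores every item,
    frequent or not, so it is built in one scan and never pruned."""

    def __init__(self):
        self.root = Node()
        self.items = set()

    def insert(self, sequence):
        """Add one session; used for the initial scan and for later updates."""
        node = self.root
        for item in sequence:
            self.items.add(item)
            node = node.children.setdefault(item, Node(item))
            node.count += 1

    @staticmethod
    def _first_occurrences(roots, item):
        """Nodes labelled `item` below `roots` with no `item` ancestor in between."""
        found = []
        stack = [child for r in roots for child in r.children.values()]
        while stack:
            node = stack.pop()
            if node.item == item:
                found.append(node)  # deeper matches on this path are not "first"
            else:
                stack.extend(node.children.values())
        return found

    def mine(self, min_support):
        """Return {pattern tuple: support} for the given absolute threshold."""
        if min_support < 1:
            raise ValueError("min_support must be at least 1")
        patterns = {}

        def grow(prefix, roots):
            for item in sorted(self.items):
                nodes = self._first_occurrences(roots, item)
                support = sum(n.count for n in nodes)
                # Non-frequent items never extend a pattern, but their
                # nodes stay in the tree and are simply walked through.
                if support >= min_support:
                    pattern = prefix + (item,)
                    patterns[pattern] = support
                    grow(pattern, nodes)

        grow((), [self.root])
        return patterns


# Illustrative data only (not from the paper's benchmark dataset).
tree = AllItemsTree()
for session in ["abc", "ac", "abd", "bc"]:
    tree.insert(session)

patterns_at_2 = tree.mine(min_support=2)
patterns_at_3 = tree.mine(min_support=3)      # interactive: new threshold, no rebuild
tree.insert("abc")                            # incremental: new session, no rebuild
patterns_after_update = tree.mine(min_support=2)
```

In this sketch item `d` is below the threshold in every run, yet its node is kept, so a later update or a lower threshold could make it frequent without rescanning the original sessions.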


Similar resources

Mining Web Sequential Patterns Incrementally with Revised PLWAP Tree

Since pointing and clicking on web pages generates a continuous data stream that flows into web log data, old patterns may become stale and need to be updated. Algorithms for mining web sequential patterns from scratch include WAP, PLWAP and apriori-based GSP. An incremental technique for updating already mined patterns when the database changes, which is based on an efficient sequential mining technique like ...


A Web Log Frequent Sequential Pattern Mining Algorithm Linked WAP-Tree

Web log frequent sequence pattern mining is an important field of Web log mining and of discovering interactive frequent sequence pattern between users and websites. It is easy to analyse users’ access sequence patterns by utilizing these sequence patterns and it is meaningful to build an intelligent website by mining Web log frequent sequential patterns. The PREWAP algorithm proposed in the pa...


Mining web access patterns with first-occurrence linked WAP-trees

In this paper, we describe the concept of first-occurrence and present a web access pattern mining algorithm based on it using a novel first-occurrence linked WAP-tree (FLWAP-tree). The first-occurrences of all symbols in the base WAP-tree of the database can be found by a pre-order traversal of a portion of the WAP-tree. The frequent patterns and their projection databases can be found quickly ...


Efficient Support Coupled Frequent Pattern Mining Over Progressive Databases

There have been many recent studies on sequential pattern mining. Sequential pattern mining on progressive databases is relatively new; in it, we progressively discover the sequential patterns in a period of interest. The period of interest is a sliding window that advances continuously as time goes by. As the focus of the sliding window changes, the new items are added to the dataset of inte...


Mining of Users’ Access Behaviour for Frequent Sequential Pattern from Web Logs

Sequential Pattern mining is the process of applying data mining techniques to a sequential database for the purposes of discovering the correlation relationships that exist among an ordered list of events. The task of discovering frequent sequences is challenging, because the algorithm needs to process a combinatorially explosive number of possible sequences. Discovering hidden information fro...




Publication date: 2013